
Author Search Result

[Author] Bo LI(94hit)

41-60hit(94hit)

  • An Inductive-Coupling Interconnected Application-Specific 3D NoC Design

    Zhen ZHANG  Shouyi YIN  Leibo LIU  Shaojun WEI  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E96-A No:12
      Page(s):
    2633-2644

    TSV-interconnected 3D chips face problems such as high cost, low yield and large power dissipation. We propose a wireless 3D on-chip-network architecture for application-specific SoC design that uses inductive-coupling interconnects instead of TSVs for inter-layer communication. The primary design challenge of an inductive-coupling 3D SoC is allocating wireless links in the 3D on-chip network effectively. We develop a design flow that fully exploits the design space opened up by wireless links while offering users flexible trade-offs. Experimental results show that our design improves on the uniform design and the Sunfloor algorithm in latency (5% to 20%) and power consumption (10% to 45%).

  • On Maximizing the Lifetime of Wireless Sensor Networks in 3D Vegetation-Covered Fields

    Wenjie YU  Xunbo LI  Zhi ZENG  Xiang LI  Jian LIU  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2018/03/01
      Vol:
    E101-D No:6
      Page(s):
    1677-1681

    In this paper, the problem of extending the lifetime of wireless sensor networks (WSNs) with redundant sensor nodes deployed in 3D vegetation-covered fields is modeled, which involves building communication models, a network model and an energy model. Generally, such a problem cannot be solved directly by a conventional method. Here we propose an Artificial Bee Colony (ABC) based optimal grouping algorithm (ABC-OG) to solve it. The main contribution of the algorithm is to find the optimal number of feasible subsets (FSs) of the WSN and assign them to work in rotation. It is verified that reasonably grouping sensors into FSs can even out the network energy consumption and prolong the lifetime of the network. To further verify the effectiveness of ABC-OG, two other algorithms are included for comparison. The experimental results show that the proposed ABC-OG algorithm provides better optimization performance.

  • An Implementation of Multiple-Standard Video Decoder on a Mixed-Grained Reconfigurable Computing Platform

    Leibo LIU  Dong WANG  Yingjie CHEN  Min ZHU  Shouyi YIN  Shaojun WEI  

     
    PAPER-Computer System

      Publicized:
    2016/02/02
      Vol:
    E99-D No:5
      Page(s):
    1285-1295

    This paper presents the design of a multiple-standard 1080 high definition (HD) video decoder on a mixed-grained reconfigurable computing platform integrating coarse-grained reconfigurable processing units (RPUs) and FPGAs. The proposed RPU, which includes 16×16 multi-functional processing elements (PEs), is used to accelerate compute-intensive tasks in video decoding. A soft-core-based microprocessor array is implemented on the FPGA and used to speed up the dynamic reconfiguration of the RPU. Furthermore, a mailbox-based communication scheme is used to improve the communication efficiency between RPUs and FPGAs. By exploiting dynamic reconfiguration of the RPUs and static reconfiguration of the FPGAs, the proposed platform achieves scalable performance and cost trade-offs to support a variety of video coding standards, including MPEG-2, AVS, H.264, and HEVC. The measured results show that the proposed platform can support H.264 1080 HD video streams at up to 57 frames per second (fps) and HEVC 1080 HD video streams at up to 52 fps at 250 MHz. It also achieves a 3.6× performance gain over an industrial coarse-grained reconfigurable processor for H.264 decoding, and a 6.43× performance gain over a general-purpose-processor-based implementation for HEVC decoding.

  • On Finding Maximum Disjoint Paths for Many-to-One Routing in Wireless Multi-Hop Network

    Bo LIU  Junzhou LUO  Feng SHAN  Wei LI  Jiahui JIN  Xiaojun SHEN  

     
    PAPER

      Vol:
    E97-D No:10
      Page(s):
    2632-2640

    Provisioning multiple paths can improve fault tolerance and transport capability of multi-routing in wireless networks. Disjoint paths can improve the diversity of paths and further reduce the risk of simultaneous link failure and network congestion. In this paper we first address a many-to-one disjoint-path problem (MOND) for multi-path routing in a multi-hop wireless network. The objective of this problem is to maximize the minimum number of disjoint paths of every source to the destination. We prove that it is NP-hard to obtain k disjoint paths for every source when k ≥ 3. To solve this problem efficiently, we propose a heuristic algorithm called TOMAN based on network flow theory. Experimental results demonstrate that it outperforms three related algorithms.
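
The abstract above says only that TOMAN is based on network flow theory. As a hedged illustration of that general idea, and not of TOMAN itself, the sketch below counts edge-disjoint paths from one source to the destination by running a unit-capacity maximum flow (Menger's theorem). The function name, the directed edge-list input and the example graph are assumptions made for this sketch; the paper's own network model and disjointness notion may differ.

```python
from collections import defaultdict, deque


def count_edge_disjoint_paths(edges, source, sink):
    """Count edge-disjoint directed paths from source to sink.

    Illustrative only: a unit-capacity max flow found with BFS
    augmenting paths (Edmonds-Karp). This is NOT the TOMAN heuristic.
    """
    if source == sink:
        raise ValueError("source and sink must differ")

    capacity = defaultdict(int)   # residual capacity per directed edge
    neighbours = defaultdict(set)
    for u, v in edges:
        capacity[(u, v)] += 1     # each link can carry one path
        neighbours[u].add(v)
        neighbours[v].add(u)      # allow traversal of the residual edge

    paths = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return paths

        # Push one unit of flow back along the path that was found.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        paths += 1


# Hypothetical topology: sources "a" and "b", destination "d".
example_edges = [("a", "x"), ("a", "y"), ("x", "d"), ("y", "d"),
                 ("x", "y"), ("b", "x")]
paths_from_a = count_edge_disjoint_paths(example_edges, "a", "d")
paths_from_b = count_edge_disjoint_paths(example_edges, "b", "d")
```

Here each source is evaluated independently; the many-to-one problem in the abstract also has to handle paths of different sources jointly, which this sketch does not attempt.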

  • Battery-Aware Loop Nests Mapping for CGRAs

    Yu PENG  Shouyi YIN  Leibo LIU  Shaojun WEI  

     
    PAPER-Architecture

      Vol:
    E98-D No:2
      Page(s):
    230-242

    Coarse-grained Reconfigurable Architecture (CGRA) is a promising mobile computing platform that provides both high performance and high energy efficiency. In an application, loop nests are usually mapped onto the CGRA for further acceleration, so optimizing the mapping is an important goal in the design of CGRAs. Moreover, because almost all mobile devices are powered by batteries, reducing energy consumption is also one of the primary concerns in using CGRAs. This paper makes three contributions: a) proposing an energy consumption model for CGRAs; b) formulating the loop nest mapping problem to minimize the battery charge loss; c) deriving an efficient heuristic algorithm called BPMap. Experimental results on most kernels of the benchmarks and real-life applications show that our methods can improve the performance of the kernels and lower the energy consumption.

  • Parallelization of Computing-Intensive Tasks of SIFT Algorithm on a Reconfigurable Architecture System

    Peng OUYANG  Shouyi YIN  Hui GAO  Leibo LIU  Shaojun WEI  

     
    PAPER

      Vol:
    E96-A No:6
      Page(s):
    1393-1402

    The Scale Invariant Feature Transform (SIFT) algorithm is an excellent approach for feature detection, characterized by data-intensive computation. Current studies on accelerating the SIFT algorithm fall mainly into three categories: optimizing the parallel parts of the algorithm on general-purpose multi-core processors, designing customized multi-core processors dedicated to SIFT, and implementing it on FPGA platforms. The real-time performance of SIFT has been greatly improved. However, in some solutions factors such as the input image size, the number of octaves and the scale factors of the SIFT algorithm are restricted, so the flexibility needed to maintain high execution performance under variable factors should be improved. This paper proposes a reconfigurable solution to this problem. We fully exploit the algorithm and adopt several techniques, such as fully parallel execution, block computation and CORDIC transformation, to improve the execution efficiency on a REconfigurable MUltimedia System called REMUS. Experimental results show that the execution performance of SIFT is improved by 33%, 50% and 8 times compared with execution on the multi-core platform, FPGA and ASIC, respectively. The dynamic reconfiguration scheme in this work can configure the circuits to meet the computation requirements under different input image sizes, numbers of octaves and scale factors during computation.

  • Single Parameter Logarithmic Image Processing for Edge Detection

    Fuji REN  Bo LI  Qimei CHEN  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E96-D No:11
      Page(s):
    2437-2449

    Considering the non-linear properties of the human visual system, many non-linear operators and models have been developed, particularly the logarithmic image processing (LIP) model proposed by Jourlin and Pinoli, which has been proved to be physically justified by several laws of the human visual system and has been successfully applied in image processing. Recently, several modifications based on this logarithmic mathematical framework have been presented, such as parameterized logarithmic image processing (PLIP), pseudo-logarithmic image processing, and homomorphic logarithmic image processing. In this paper, a new single parameter logarithmic model for image processing with an adaptive parameter-based Sobel edge detection algorithm is presented. On the basis of an analysis of the distributive law, the subtractive law, and the isomorphic property of the PLIP model, the five parameters in PLIP are replaced by a single parameter to ensure the completeness of the model and physical constancy with the nature of an image, and then an adaptive parameter-based Sobel edge detection algorithm is proposed. An image noise estimation method is used to evaluate the noise level of an image, and the adaptive parameter in the single parameter LIP model is calculated from the noise level and grayscale value of the corresponding image area. The single-parameter LIP-based Sobel operation is then applied to overcome the noise sensitivity of classical LIP-based Sobel edge detection methods, especially in dark areas of an image, while retaining edge sensitivity. Compared with the classical LIP and PLIP models, the proposed single parameter LIP achieves satisfactory results in noise suppression and edge accuracy.

  • SCUT-AutoALP: A Diverse Benchmark Dataset for Automatic Architectural Layout Parsing

    Yubo LIU  Yangting LAI  Jianyong CHEN  Lingyu LIANG  Qiaoming DENG  

     
    LETTER-Computer Graphics

      Publicized:
    2020/09/03
      Vol:
    E103-D No:12
      Page(s):
    2725-2729

    Computer aided design (CAD) technology is widely used for architectural design, but current CAD tools still require high-level design specifications from humans. It would be significant to construct an intelligent CAD system allowing automatic architectural layout parsing (AutoALP), which generates candidate designs or predicts architectural attributes without much user intervention. To tackle these problems, many learning-based methods have been proposed, and benchmark datasets have become one of the essential elements of data-driven AutoALP. This paper proposes a new dataset called SCUT-AutoALP for multi-paradigm applications. It contains two subsets: 1) Subset-I is for floor plan design and contains 300 residential floor plan images with layout, boundary and attribute labels; 2) Subset-II is for urban plan design and contains 302 campus plan images with layout, boundary and attribute labels. We analyzed the samples and labels statistically, and evaluated SCUT-AutoALP on different layout parsing tasks for floor plans and urban plans based on conditional generative adversarial network (cGAN) models. The results verify the effectiveness and indicate the potential applications of SCUT-AutoALP. The dataset is available at https://github.com/designfuturelab702/SCUT-AutoALP-Database-Release.

  • Hardware Software Co-design of H.264 Baseline Encoder on Coarse-Grained Dynamically Reconfigurable Computing System-on-Chip

    Hung K. NGUYEN  Peng CAO  Xue-Xiang WANG  Jun YANG  Longxing SHI  Min ZHU  Leibo LIU  Shaojun WEI  

     
    PAPER-Computer System

      Vol:
    E96-D No:3
      Page(s):
    601-615

    REMUS-II (REconfigurable MUltimedia System 2) is a coarse-grained dynamically reconfigurable computing system for multimedia and communication baseband processing. This paper proposes a real-time H.264 baseline profile encoder on REMUS-II. First, we propose an overall flow for mapping algorithms onto the REMUS-II platform and illustrate it by implementing the H.264 encoder. Second, parallel and pipelining techniques are used to fully exploit the abundant computing resources of REMUS-II, thus increasing total computing throughput and handling the high computational complexity of the H.264 encoder. In addition, data-reuse schemes are used to increase the data-reuse ratio and therefore reduce the required data bandwidth. Third, we propose a scheduling scheme to manage run-time reconfiguration of the system. The scheduler is also responsible for synchronizing data communication between tasks and handling conflicts between hardware resources. Experimental results show that the REMUS-MB system (the REMUS-II version for mobile applications) can run a real-time H.264/AVC baseline profile encoder. The encoder can encode CIF@30 fps video sequences with two reference frames and a maximum search range of [-16,15]. The implementation can therefore be applied to handheld devices targeted at mobile multimedia applications. The REMUS-MB platform is designed and synthesized using TSMC 65 nm low power technology. The die size of REMUS-MB is 13.97 mm². REMUS-MB consumes, on average, about 100 mW while working at 166 MHz. To our knowledge, this is the first implementation in the literature of the H.264 encoding algorithm on a coarse-grained dynamically reconfigurable computing system.

  • New Perfect Gaussian Integer Sequences from Cyclic Difference Sets

    Tao LIU  Chengqian XU  Yubo LI  Kai LIU  

     
    LETTER-Information Theory

      Vol:
    E100-A No:12
      Page(s):
    3067-3070

    In this letter, three constructions of perfect Gaussian integer sequences based on cyclic difference sets are presented. Sufficient conditions for constructing perfect Gaussian integer sequences are given. Compared with the constructions given by Chen et al. [12], the proposed constructions relax the restrictions on the parameters of the cyclic difference sets, and new perfect Gaussian integer sequences are obtained.
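
For readers unfamiliar with the term, a Gaussian integer sequence has entries a+bi with integer a and b, and it is called perfect when its periodic autocorrelation is zero at every nonzero shift. The sketch below only checks that standard definition with exact integer arithmetic; it does not reproduce the letter's constructions from cyclic difference sets. The function names and the length-4 example sequence (1, i, 1, -i) are assumptions chosen for illustration.

```python
def periodic_autocorrelation(seq, shift):
    """Return R(shift) = sum_n s[n] * conj(s[(n + shift) mod N]).

    Each element of seq is a Gaussian integer given as an (re, im) pair
    of Python ints, so the result is exact.
    """
    length = len(seq)
    re_sum, im_sum = 0, 0
    for n in range(length):
        a, b = seq[n]
        c, d = seq[(n + shift) % length]
        # (a + bi) * (c - di) = (ac + bd) + (bc - ad)i
        re_sum += a * c + b * d
        im_sum += b * c - a * d
    return re_sum, im_sum


def is_perfect(seq):
    """True if every out-of-phase periodic autocorrelation value is zero."""
    return all(periodic_autocorrelation(seq, shift) == (0, 0)
               for shift in range(1, len(seq)))


# Illustrative sequence (1, i, 1, -i); not taken from the letter.
example = [(1, 0), (0, 1), (1, 0), (0, -1)]
example_is_perfect = is_perfect(example)
all_ones_is_perfect = is_perfect([(1, 0)] * 4)
```

The in-phase value `periodic_autocorrelation(example, 0)` is the sequence energy, so a perfect sequence concentrates all of its autocorrelation at shift zero.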

  • Facial Expression Recognition via Regression-Based Robust Locality Preserving Projections

    Jingjie YAN  Bojie YAN  Ruiyu LIANG  Guanming LU  Haibo LI  Shipeng XIE  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2017/11/06
      Vol:
    E101-D No:2
      Page(s):
    564-567

    In this paper, we present a novel regression-based robust locality preserving projections (RRLPP) method to deal effectively with noise and occlusion in facial expression recognition. As in the robust principal component analysis (RPCA) and robust regression (RR) approaches, the basic idea of RRLPP is to introduce a low-rank term and a sparse term for the facial expression image sample matrix, to simultaneously overcome the shortcomings of the locality preserving projections (LPP) method and enhance the robustness of facial expression recognition. However, RRLPP is a nonlinear robust subspace method which can effectively describe the local structure of facial expression images. Test results on the Multi-PIE facial expression database indicate that the RRLPP method can effectively handle noise and occlusion in facial expression images, and that it achieves a better or comparable facial expression recognition rate compared with both non-robust and robust subspace methods.

  • New Families of Almost Binary Sequences with Optimal Autocorrelation Property

    Xiuping PENG  Hongbin LIN  Yanmin LIU  Xiaoyu CHEN  Xiaoxia NIU  Yubo LI  

     
    LETTER-Coding Theory

      Vol:
    E102-A No:2
      Page(s):
    467-470

    Two new families of balanced almost binary sequences with a single zero element of period L=2q are presented in this letter, where q=4d+1 is an odd prime number. These sequences have optimal autocorrelation value or optimal autocorrelation magnitude. Our constructions are based on cyclotomy and the Chinese Remainder Theorem.

  • Layout-Aware Variability Characterization of CMOS Current Sources

    Bo LIU  Bo YANG  Shigetoshi NAKATAKE  

     
    PAPER

      Vol:
    E95-C No:4
      Page(s):
    696-705

    Current sources are essential components of analog circuit designs, and their mismatch significantly degrades circuit performance. This paper addresses the mismatch model of CMOS current sources and, unlike conventional modeling, focuses on the layout- and λ-dependency of the process variation, where λ is the output conductance parameter. To clarify which variation parameters influence the mismatch, we implemented a test chip in a 90 nm process technology, from which we can collect characteristic variation data for MOSFETs of various layouts. The test chip also includes D/A converters to check the differential non-linearity (DNL) caused by the mismatch of current sources when they operate as a DAC. By identifying the variation and the circuit-level errors in the measured DNLs, we show that our model accounts for the current variation more accurately than the conventional mismatch model.

  • New Families of Quaternary Sequences of Period 2p with Low Autocorrelation

    Xiaofei SONG  Yanguo JIA  Xiumin SHEN  Yubo LI  Xiuping PENG  

     
    LETTER-Coding Theory

      Vol:
    E101-A No:11
      Page(s):
    1964-1969

    In this letter, two new families of quaternary sequences with low four-level or five-level autocorrelation are constructed based on generalized cyclotomy over Z_{2p}. These quaternary sequences are balanced and the maximal absolute value of the out-of-phase autocorrelation is 4.

  • Constructions of Binary Sequence Pairs of Length 5q with Optimal Three-Level Correlation

    Xiumin SHEN  Xiaofei SONG  Yanguo JIA  Yubo LI  

     
    LETTER-Coding Theory

      Publicized:
    2021/04/14
      Vol:
    E104-A No:10
      Page(s):
    1435-1439

    Binary sequence pairs with optimal periodic correlation have important applications in many fields of communication systems. In this letter, four new families of binary sequence pairs are presented based on the generalized cyclotomy over Z_{5q}, where q ≠ 5 is an odd prime. All these binary sequence pairs have optimal three-level correlation values {-1, 3}.

  • Micro Recording Performance Fluctuation and Magnetic Roughness Analysis: Methodology and Application

    Bo LIU  Wei ZHANG  Sheng-Bin HU  

     
    PAPER

      Vol:
    E83-C No:9
      Page(s):
    1530-1538

    As technology moves at an annual areal density increase rate of 80-120% and channel density moves beyond 3, micro-fluctuation of media recording performance and the homogeneity of the media's recording capability become serious reliability concerns in future high density magnetic recording systems. Two concepts are proposed in this work for characterizing micro-recording performance fluctuation at high bit and channel densities: recording performance roughness analysis and dynamic magnetic roughness analysis. The recording performance roughness analysis is based on an in-situ measurement technique for the non-linear transition shift (NLTS). The relationship between the performance roughness and the roughness of dynamic magnetic parameters is studied. Results of experimental investigations indicate that the NLTS-based performance roughness analysis can reveal more details of the media's recording capability and of the capability fluctuation, that is, the macro and micro fluctuation of recording performance. The dynamic magnetic roughness analysis is based on read/write operations and can be used to characterize the macro and micro fluctuation of the media's dynamic magnetic properties. The parameters used for the analysis include the media's dynamic coercivity and dynamic coercive squareness. Here, "dynamic" refers to the dynamic performance measured at MHz frequencies. The authors also noticed in their technology development process that further methodology development and confirmation are necessary for analyzing the media's dynamic performance. Therefore, the work also extends to an accuracy analysis of playback-amplitude-based methods for analyzing the dynamic coercive squareness and dynamic hysteresis loop. A method with smaller testing error is identified and reported in this work.

  • Least Squares Constant Modulus Blind Adaptive Beamforming with Sparse Constraint

    Jun LI  Hongbo XU  Hongxing XIA  Fan LIU  Bo LI  

     
    LETTER-Antennas and Propagation

      Vol:
    E95-B No:1
      Page(s):
    313-316

    Beamforming with sparse constraint has shown significant performance improvement. In this letter, a least squares constant modulus blind adaptive beamforming with sparse constraint is proposed. Simulation results indicate that the proposed approach exhibits better performance than the well-known least squares constant modulus algorithm (LSCMA).

  • On Attractive Force of Evanescent Electromagnetic Field on Dielectric Slab*

    Jingbo LI  Masahiro AGU  

     
    PAPER

      Vol:
    E79-C No:10
      Page(s):
    1308-1311

    The electromagnetic force of an evanescent field acting on a dielectric slab is studied using the Maxwell stress tensor. The results show that the dielectric slab always experiences an attractive force when the incident wave is an evanescent field, whereas it may experience either a pressure or an attractive force when the incident wave is a propagating one. The magnitude of the attractive force due to the evanescent field is much larger than that due to the propagating wave. Some numerical examples are given.

  • Maximizing the Profit of Datacenter Networks with HPFF

    Bo LIU  Hui HU  Chao HU  Bo XU  Bing XU  

     
    LETTER-Information Network

      Publicized:
    2017/04/05
      Vol:
    E100-D No:7
      Page(s):
    1534-1537

    Maximizing the profit of datacenter networks (DCNs) requires satisfying the requirements of more flows simultaneously, but existing schemes always allocate resources based on a single flow attribute, which prevents accurate resource allocation and causes many flows to fail. In this letter, we propose Highest Priority Flow First (HPFF) to maximize DCN profit, which allocates resources to flows according to their priority. HPFF employs a utility function that considers multiple flow attributes, including flow size, deadline and demanded bandwidth, to calculate the priority of each flow. Experiments on a testbed show that HPFF can improve the network profit by 6.75%-19.7% and decrease the number of failed flows by 26.3%-83.3% compared with existing schemes under real DCN workloads.

  • Hybrid Wired/Wireless On-Chip Network Design for Application-Specific SoC

    Shouyi YIN  Yang HU  Zhen ZHANG  Leibo LIU  Shaojun WEI  

     
    PAPER

      Vol:
    E95-C No:4
      Page(s):
    495-505

    The hybrid wired/wireless on-chip network is a promising communication architecture for multi-/many-core SoCs. For application-specific SoC design, it is important to design a dedicated on-chip network architecture that matches the application's characteristics. In this paper, we propose a heuristic wireless link allocation algorithm for creating a hybrid on-chip network architecture. The algorithm can eliminate performance bottlenecks by replacing multi-hop wired paths with high-bandwidth single-hop long-range wireless links. The simulation results show that the hybrid on-chip network designed by our algorithm significantly improves performance in terms of both communication delay and energy consumption.
